INRIA Bordeaux - VizWhiz
VAST 2008 Challenge
Mini Challenge 3:  Cell Phone Calls 

Authors and Affiliations:

Romain Bourqui, LaBRI, bourqui@labri.fr [PRIMARY contact]
Frederic Gilbert, INRIA Bordeaux, Frederic.pierre.gilbert@inria.fr
Umang Sharan, INRIA Bordeaux, umangsh@gmail.com
Paolo Simonetto, LaBRI, paolo.simonetto@labri.fr
Faraz Zaidi, LaBRI, faraz.zaidi@labri.fr

 

Student team: YES

Tool(s):

Tulip (http://www.tulip-software.org): A visualization system created by David Auber to draw and display huge graphs, allowing the navigation through geometric operations as well as the extraction of subgraphs and the enhancement of the results obtained by filtering.

 

 

Two Page Summary:   YES

 

      Summary file

 

ANSWERS:


Phone-1: What is the Catalano/Vidro social network, as reflected in the cell phone call data, at the end of the time period

 

      Phone Nodes (Node names 1 and 3 are interchangeable – we didn’t find sufficient data to differentiate between those two important nodes)

 

      Phone Links

 


Phone-2  Characterize the changes in the Catalano/Vidro social structure over the ten day period.

Detailed Answer:

We began our exploration by importing the dataset into Tulip. We found that the call/location graphs remain the same for the first two days, then change for days 3 and 4, revert to the original state for days 5,6,7 and then change drastically towards days 8,9,10. The following figures expose the important nodes within the Catalano call network structure based on the delta-efficiency metric [1,2] applied to the call graphs. Tulip provides an efficient platform for exploring and navigating large graphs, an extendible plugin mechanism to apply metrics and a set of pre-defined visualization algorithms for graph layout.

 

 

 

 

We infer the most important nodes in the network by their delta-efficiency scores. To determine the boss, we apply the maximal neighborhood intersection metric to the most important nodes just determined. Once we know the boss, we apply Kruskal’s MST to infer the hierarchy from the network structure. We obtain the following hierarchy for the social network by using call graphs from days 1,2,5,6,7 where we believe that the social network remained the same. We conjecture that node 200 is the boss and nodes 1,3,5 are important nodes in the network. Contrast this with the network shown by the image corresponding to day 9 above where nodes 306 and 309, which were not important previously, have suddenly risen in importance.

 

 

Next, to analyze the temporal behavior of the social network over the 10 days, we clustered and filtered the location graphs as described in the summary paper. We first construct the call graph corresponding to each day. Then, for each daily call graph, we add position edges between two nodes if they have called someone from the same cell tower during the day. Each position edges (u,v) is weighted by the minimum number of calls of u or v from the corresponding cell tower. Thereafter, we filter each graph based on the edge weights with weight thresholds varying from 1 through 5. Observe that the graph corresponding to the threshold 5 contains really few position edges and is composed almost entirely of call edges. For each of the 50 graphs thus generated, we apply a clustering algorithm based on the strength metric ([3]).

 

The above figure shows the temporal evolution of community structure in Tulip. The top two figures show the call graph and location graph for the entire span of 10 days respectively. The left sub-figure on the bottom row shows the clustering algorithm applied to the temporal filters (for formal definitions, refer to the summary file) for each day. The cluster similarity coefficient varies from least similar (yellow edges) to very similar (blue edges). The snapshot on the bottom right shows the different communities inferred in the Catalano social structure – edges have been omitted for brevity. A more intuitive node-link visualization for the community structure is shown below.

 

 


 

[1] N. Memon, D. L. Hicks, and H. L. Larsen. How investigative data mining can help intelligence agencies to discover dependence of nodes in terrorist networks. In ADMA, pages 430–441, 2007.

 

[2] N. Memon and H. L. Larsen. Practical approaches for analysis, visualization and destabilizing terrorist networks. In ARES ’06: Proceedings of the First International Conference on Availability, Reliability and Security, pages 906–913, Washington, DC, USA, 2006. IEEE Computer Society.

 

[3] D . Auber, Y. Chiricota, F. Jourdan, and G. Melancon. Multiscale visualization of small world networks. In INFOVIS '03: Proceedings of the IEEE Symposium on Information Visualization (INFOVIS'03), pages 75-81, 2003.